1. Introduction.

Consider a Brownian motion process $W(t)$, $0 \le t < \infty$, which during the time interval
$[0,\nu]$ has drift 0 and during $(\nu,\infty)$ has drift $\mu > 0$, where $\nu \le \infty$ and $\mu$ are unknown
parameters. We seek a stopping rule $T$ which "detects" the change point $\nu$ "as soon as
possible." For example, $W(t)$ may represent the cumulative output of an industrial process,
which is under control so long as the average output is 0, but which may go out of control and
then must be corrected as soon as possible. Other applications include maintaining the
quality of repeated assays (Wilson et al., 1979) and surveillance of birth records for a
possible increase of genetic malformations (e.g. Weatherall and Haskey, 1976).

Let $P_\nu$ denote probability when the change occurs at time $\nu$ ($\nu \le \infty$). (The dependence
on $\mu$ is suppressed. When it seems desirable to emphasize this dependence, we shall write
$P_{\nu,\mu}$. Note that $P_\infty = P_{0,0}$.) A stopping rule $T$ to detect the change point should have a
large value for $E_\infty(T)$, i.e. if no change occurs, the expected time until one is "detected"
should be large. Subject to $E_\infty(T)$ being large, a good detection rule should in some sense
have small values of $E_\nu(T - \nu \mid T > \nu)$, i.e. the time after a change occurs until it is
detected should be small. Some common detection rules (including those considered in this
paper) satisfy $\sup_\nu E_\nu(T - \nu \mid T > \nu) = E_0(T)$, in which case a detection rule can to some
extent be evaluated in terms of its Average Run Lengths: $E_\infty(T)$, which should be large,
and $E_0(T)$, which should be small.

One possible solution to the detection problem is given by the so-called cusum tests
proposed by Page (1954). For a systematic discussion of these procedures, see van Dobben de
Bruyn (1968). An outstanding contribution to the substantial literature on cusum processes
is Lorden (1971), who shows that they are asymptotically optimal when $E_\infty(T)$ is infinitely
large.

Shiryayev (1963) and Roberts (1966) independently proposed the same competitor to
cusum tests. Recently Pollak (1984) has proven an optimality property for the
Shiryayev-Roberts rule (in discrete time - see Appendix A for a brief discussion of this result in the
present setting) which seems considerably stronger than Lorden's asymptotic optimality of
cusum stopping rules.

The purpose of the present paper is to make a quantitative comparison of the
Shiryayev-Roberts and Page procedures. We do this in the context of continuous time in order to use
the machinery of diffusion processes to perform explicitly certain calculations, which seem
impossible in discrete time. (See, however, Pollak, 1983, who makes considerable progress
on the evaluation of average run lengths in discrete time.) Although the continuous time
results are not especially good approximations to the corresponding quantities in discrete
time, they provide very useful comparative information on which to base selection of a
stopping rule.

The paper is organized as follows. The Shiryayev-Roberts process is defined in Section
2 and shown to be a novel diffusion process with some surprising properties. We also
specify more precisely the basis for our comparison of the two procedures and give the results of
some elementary calculations. These developments continue in Section 3, which contains an
asymptotic evaluation of $E_\nu(T - \nu \mid T > \nu)$ when $\nu$ and $E_\infty(T)$ are large. In Section 4 we
define a modification of our basic procedure and give an asymptotic evaluation of its average
run length. Numerical comparisons and a discussion of their significance are contained in
Section 5. Some open problems are mentioned briefly in Section 6. The reader whose
principal interest is in our conclusions may wish to read Section 2 (through Proposition 1) and
then skip directly to Sections 5 and 6, before returning to the derivations. Our conclusions
are roughly these. In simple situations where the two procedures can be directly compared,
neither seems dramatically superior to the other. However, the Shiryayev-Roberts
procedure is more easily adapted to complex circumstances and consequently warrants additional
study. (See Section 6 for examples.)

2.  Definition  of  the  Procedures,  Criteria  for  Comparison, 
and  Elementary  Operating  Characteristics. 

Suppose momentarily that at time $\nu$ the drift of $W(t)$ changes from 0 to some known
value $\delta > 0$. Although this will rarely be true, it is possible that a procedure derived
under this hypothesis is useful even if the drift $\mu$ after time $\nu$ is unknown. Also, for the
sake of motivating our stopping rule, suppose that $\nu$ is itself a random variable which is
exponentially distributed with mean $1/\lambda$. Then the posterior probability that there has
been a change before time $t$ given the data until $t$ is

$$P\{\nu \le t \mid W(s),\, s \le t\} = \frac{\int_0^t \lambda \exp(-\lambda s)\exp[\delta\{W(t)-W(s)\} - \delta^2(t-s)/2]\,ds}{\int_0^t \lambda \exp(-\lambda s)\exp[\delta\{W(t)-W(s)\} - \delta^2(t-s)/2]\,ds + \exp(-\lambda t)}.$$

Consider the rule which stops and declares that a change has taken place when this posterior
probability exceeds some threshold for the first time. (For a particular loss structure
Shiryayev, 1963, has shown that the Bayes rule has this form.) For $\lambda$ close to 0 this
stopping rule is approximately

$$T = T_B = \inf\Bigl\{t : \int_0^t \exp[\delta\{W(t)-W(s)\} - \delta^2(t-s)/2]\,ds \ge B\Bigr\}, \tag{1}$$

which is a stopping rule proposed by Shiryayev (1963) and Roberts (1966).

Page's rule is similar, but is motivated by maximum likelihood rather than Bayesian
considerations. It is defined by stopping at

$$\tilde T = \tilde T_c = \inf\Bigl\{t : \delta\Bigl[W(t) - \delta t/2 - \min_{0 \le s \le t}\{W(s) - \delta s/2\}\Bigr] \ge c\Bigr\}. \tag{2}$$
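
As a rough illustration of rules (1) and (2), the following sketch updates both statistics along a path observed on a grid. It assumes a grid spacing `dt`, replaces the integral in (1) by a left-endpoint sum, and replaces the running minimum in (2) by a minimum over grid points; the function names are illustrative and do not come from the paper.

```python
import math

def shiryayev_roberts_step(r_prev, dw, delta, dt):
    """One grid step of the statistic in (1): (R + dt) * exp(delta*dW - delta^2*dt/2).

    Assumption: the integral over the new interval is approximated by dt
    times the likelihood-ratio increment (left-endpoint rule).
    """
    return (r_prev + dt) * math.exp(delta * dw - 0.5 * delta ** 2 * dt)

def page_step(y_prev, dw, delta, dt):
    """One grid step of the reflected statistic in (2): max(0, Y + delta*dW - delta^2*dt/2)."""
    return max(0.0, y_prev + delta * dw - 0.5 * delta ** 2 * dt)

def first_alarm_times(increments, delta, dt, B, c):
    """Return (T, T_tilde): first grid times with R >= B and Y >= c (None if no alarm)."""
    r, y = 0.0, 0.0
    t_sr = t_page = None
    for n, dw in enumerate(increments, start=1):
        r = shiryayev_roberts_step(r, dw, delta, dt)
        y = page_step(y, dw, delta, dt)
        if t_sr is None and r >= B:
            t_sr = n * dt
        if t_page is None and y >= c:
            t_page = n * dt
    return t_sr, t_page
```

With simulated Gaussian increments of variance `dt` (plus `mu * dt` after the change point), averaging the alarm times over many paths gives Monte Carlo estimates of the average run lengths, subject to discretization error.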

In principle we would like to choose $B$ and $c$ so that $E_\infty(T) = E_\infty(\tilde T)$, then compare
$E_\nu(T - \nu \mid T > \nu)$ and $E_\nu(\tilde T - \nu \mid \tilde T > \nu)$ as functions of both $\nu$ and $\mu$. In fact our
comparisons are basically between the extreme cases $\nu = 0$ and $\nu = \infty$. Sometimes they
are asymptotic as $B$ (hence also $c$) $\to \infty$. The easiest comparisons are, of course, when we
suppose that the only possible value of $\mu$ is the hypothesized value $\mu = \delta$; but we shall also
consider other possible values. (In Section 4 we introduce a modification of the stopping
rule (1) which is designed to deal with the case of unknown $\mu$.)

We begin with a detailed examination of the stopping rule (1). Let

$$R(t) = \int_0^t \exp[\delta\{W(t)-W(s)\} - \delta^2(t-s)/2]\,ds. \tag{3}$$

Since for fixed $s$, $\exp[\delta\{W(t)-W(s)\} - \delta^2(t-s)/2]$ is a $P_\infty$-martingale in $t$ for $t \ge s$, it
follows that $R(t) - t$ is also a $P_\infty$-martingale. Also $E_\infty\{R(t)\} = t$. An easy application of
optional stopping (Loève, 1963, p. 534) yields

Proposition 1. $E_\infty(T) = E_\infty\{R(T)\} = B$.

Remark: The martingale property of $R(t) - t$, which yields a simple formula for $E_\infty(T)$, proves
to be very useful in adapting the Shiryayev-Roberts rule (1) to deal with more complicated
problems. See Section 6 for some examples.

From (3) it is easy to see that $R(t)$ is a Markov process with stationary transition
probabilities (as long as the drift of $W(t)$ does not change). It follows from Itô's formula
that for all $\mu$ the $P_{0,\mu}$ stochastic differential of $R(t)$ is given by

$$dR(t) = \{1 + \mu\delta R(t)\}\,dt + \delta R(t)\,dW(t), \tag{4}$$

where $W(t)$, $0 \le t < \infty$, is standard Brownian motion (with drift 0). Hence under $P_{0,\mu}$ the
differential generator of $R(t)$ is given by

$$Df(x) = \tfrac{1}{2}\delta^2x^2f''(x) + (1 + \mu\delta x)f'(x) \qquad (x > 0). \tag{5}$$

From (5) and standard diffusion theory it is possible to compute the average run lengths
$E_{0,\mu}(T)$ in a fairly explicit form. A convenient reference for the following calculations is
Karlin and Taylor (1981, Chapter 15).

The scale function $S_\mu(x)$ of the process $R(t)$ is determined by integrating the relation

$$S'_\mu(x) = \exp\{2/(\delta^2x) - (2\mu\log x)/\delta\}, \tag{6}$$

and the speed measure is given by

$$dM_\mu(x) = dx/\{\delta^2x^2S'_\mu(x)\}. \tag{7}$$

It is easy to see that

$$\int_0^1 \{S_\mu(1) - S_\mu(x)\}\,dM_\mu(x) < \infty, \tag{8}$$

and hence 0 is an entrance boundary.
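
For readers who want the intermediate step, the scale density and speed measure can be obtained from the generator (5) as sketched below. This is the standard recipe for a one-dimensional diffusion (solve $DS = 0$, then take $dM = dx/\{\sigma^2(x)S'(x)\}$ with $\sigma^2(x) = \delta^2x^2$); the special cases $\mu = \delta$ and $\mu = 0$ are worked out because they are the ones used later.

```latex
\begin{align*}
DS_\mu = 0
  &\;\Longrightarrow\;
  \frac{S''_\mu(x)}{S'_\mu(x)} = -\frac{2(1+\mu\delta x)}{\delta^2 x^2}
  = -\frac{2}{\delta^2 x^2} - \frac{2\mu}{\delta x}, \\
\log S'_\mu(x) &= \frac{2}{\delta^2 x} - \frac{2\mu}{\delta}\log x
  \qquad\text{(additive constant set to 0)}, \\
dM_\mu(x) &= \frac{dx}{\delta^2 x^2 S'_\mu(x)}
  = \delta^{-2}\, x^{2\mu/\delta - 2} \exp\{-2/(\delta^2 x)\}\,dx, \\
\mu = \delta:&\quad S'_\delta(x) = x^{-2}\exp\{2/(\delta^2 x)\}, \qquad
  dM_\delta(x) = \delta^{-2}\exp\{-2/(\delta^2 x)\}\,dx, \\
\mu = 0:&\quad M_0(0,y) = \delta^{-2}\int_0^y x^{-2}\exp\{-2/(\delta^2 x)\}\,dx
  = \tfrac12 \exp\{-2/(\delta^2 y)\}.
\end{align*}
```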

Since $R(t)$ is a Markov process with stationary transitions, we can consider the
process starting from $R(0) = x$. When this is the case we shall write $E^x$ and $P^x$ to denote
expectation and probability. For $a < x < b$ and $R(0) = x$, let $N = \inf\{t : R(t) \notin (a,b)\}$.
Then for nonnegative functions $h$

$$E^x_{0,\mu}\Bigl[\int_0^N h\{R(t)\}\,dt\Bigr] = \int_a^b h(y)\,G(x,y;a,b)\,dM_\mu(y), \tag{9}$$

where

$$G(x,y;a,b) = \begin{cases} 2\{S_\mu(x) - S_\mu(a)\}\{S_\mu(b) - S_\mu(y)\}/\{S_\mu(b) - S_\mu(a)\} & (x \le y), \\ G(y,x;a,b) & (x > y). \end{cases}$$



Letting $a \to 0$, then $x \to 0$ and using (8), (9) yields the following result, obtained by
Shiryayev (1963) in the special case $\mu = \delta$.

Proposition 2. For the stopping rule $T$ defined by (1) and for all $\mu$

$$E_{0,\mu}(T) = 2\int_0^B \{S_\mu(B) - S_\mu(y)\}\,dM_\mu(y), \tag{10}$$

where $S$ and $M$ are given by (6) and (7). In the special case $\mu = 0$ this becomes $E_\infty(T) = B$,
in agreement with Proposition 1. In the special case $\mu = \delta$, (10) yields

$$E_{0,\delta}(T) = 2\delta^{-2}\Bigl\{\log A + \exp(1/A)\int_{A^{-1}}^\infty (\log x)\exp(-x)\,dx\Bigr\}, \tag{11}$$

where $A = \delta^2B/2$. Letting $B \to \infty$ yields

$$E_{0,\delta}(T) = 2\delta^{-2}\{\log A - \gamma + O(A^{-1}\log A)\}, \tag{12}$$

where $\gamma \approx .5772$ is Euler's constant.
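
The following sketch evaluates (11) and compares it with the expansion (12). It relies on an integration by parts (not spelled out in the text) which rewrites the braces in (11) as $\exp(1/A)E_1(1/A)$, where $E_1(x) = \int_x^\infty u^{-1}e^{-u}\,du$ is the exponential integral; $E_1$ is computed from its power series, which is adequate for the small arguments $1/A$ of interest. Function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, terms=60):
    """E1(x) for x > 0 via E1(x) = -gamma - log(x) - sum_{k>=1} (-x)^k / (k * k!)."""
    total = 0.0
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k          # term now equals (-x)^k / k!
        total += term / k
    return -EULER_GAMMA - math.log(x) - total

def arl_sr_exact(B, delta):
    """Formula (11), rewritten as 2*delta^-2 * exp(1/A) * E1(1/A) with A = delta^2*B/2."""
    A = 0.5 * delta ** 2 * B
    return 2.0 / delta ** 2 * math.exp(1.0 / A) * exp_integral_e1(1.0 / A)

def arl_sr_asymptotic(B, delta):
    """Leading terms of (12): 2*delta^-2 * (log A - gamma)."""
    A = 0.5 * delta ** 2 * B
    return 2.0 / delta ** 2 * (math.log(A) - EULER_GAMMA)
```

For example, with `B = 792` and `delta = 1` the two functions differ only in the second decimal place, which illustrates how quickly the expansion (12) becomes accurate.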

Remark: It is easy to evaluate (10) numerically. For some purposes one probably obtains
more insight from a simple asymptotic expansion like (12). It is possible to obtain similar
expansions when $\mu \ne \delta$, but the calculus and the resulting expressions are much more
complicated. The details are omitted.

For the purpose of evaluating $E_\nu(T - \nu \mid T > \nu)$ as $\nu \to \infty$, it will be helpful to know
the $P_\infty$ limiting distribution of $R(t)$ (cf. Section 3). Since $E_\infty\{R(t)\} = t$, it is somewhat
surprising that (under $P_\infty$) $R(t)$ is actually recurrent.

Proposition 3. For any initial state $R(0) = x$, and for all $y > 0$

$$\lim_{t \to \infty} P^x_\infty\{R(t) \le y\} = \exp\{-2/(\delta^2y)\}.$$

Proof. Let $\tau_z = \inf\{t : R(t) = z\}$. For fixed $u > x = R(0)$ let $\tau$ denote the time of first
return to $x$ after passing through $u$. Then

$$E^x_\infty(\tau) = E^x_\infty(\tau_u) + E^u_\infty(\tau_x),$$

and the right hand side may be evaluated by taking appropriate limits ($a \to 0$ or $b \to \infty$)
in (9) with $h = 1$. The result is that

$$E^x_\infty(\tau) = 2\{S_0(u) - S_0(x)\}M_0(0,\infty), \tag{13}$$

which is finite by (6) and (7). Let $H(t) = P^x_\infty\{R(t) \le y\}$.

By the standard renewal argument

$$H(t) = P^x_\infty\{\tau > t,\ R(t) \le y\} + \int_0^t H(t-s)\,P^x_\infty\{\tau \in ds\}.$$

By (13) the renewal theorem applies to yield

$$\lim_{t \to \infty} H(t) = \frac{\int_0^\infty P^x_\infty\{\tau > t,\ R(t) \le y\}\,dt}{E^x_\infty(\tau)} = \frac{E^x_\infty\bigl[\int_0^\tau I\{R(t) \le y\}\,dt\bigr]}{E^x_\infty(\tau)}.$$

The numerator can be evaluated by the same limiting process that led to (13), now with
$h(z) = I(z \le y)$, to show that

$$\lim_{t \to \infty} H(t) = \int_0^y dM_0(z) \Big/ \int_0^\infty dM_0(z).$$

Using (6) and (7) to evaluate these integrals completes the proof.
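
A quick numerical check of the last step may be useful. The sketch below integrates the speed density obtained from (6) and (7) with $\mu = 0$ by the midpoint rule and compares the normalized mass with the closed form of Proposition 3; it uses the value $M_0(0,\infty) = 1/2$, which follows from the same integral with $y \to \infty$. The quantile function is a convenience added here, not something stated in the paper.

```python
import math

def limiting_cdf(y, delta):
    """Proposition 3: limiting P(R <= y) = exp(-2 / (delta^2 * y)) for y > 0."""
    return math.exp(-2.0 / (delta ** 2 * y)) if y > 0 else 0.0

def limiting_quantile(q, delta):
    """Inverse of limiting_cdf for 0 < q < 1."""
    return -2.0 / (delta ** 2 * math.log(q))

def speed_density_mu0(x, delta):
    """Density of M_0 from (6)-(7) with mu = 0: exp(-2/(delta^2 x)) / (delta^2 x^2)."""
    return math.exp(-2.0 / (delta ** 2 * x)) / (delta ** 2 * x ** 2)

def speed_mass(y, delta, n=20000):
    """M_0(0, y) by the midpoint rule on (0, y); midpoints avoid the endpoint x = 0."""
    h = y / n
    return h * sum(speed_density_mu0((i + 0.5) * h, delta) for i in range(n))
```

Since the limiting distribution has tail $P\{R(\infty) > x\} \sim 2/(\delta^2x)$, its mean is infinite, which is consistent with $E_\infty\{R(t)\} = t$ growing without bound.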

The process $Y(t) = \delta[W(t) - \delta t/2 - \min_{0 \le s \le t}\{W(s) - \delta s/2\}]$ which defines Page's stopping
rule (2) is also a diffusion process, this time with a reflecting barrier at zero, so the same
theory delivers the average run lengths and $P_\infty$ limiting distribution for this process as well.
On the other hand, the relation of (2) to the stopping rule of a sequential probability ratio
test, which was noted already by Page (1954), yields a more elementary computation of
average run lengths; and the $P_\infty$ limiting distribution is also easily computed by standard,
direct arguments. We summarize the relevant results in the following proposition.

Proposition 4. Let $\tilde T$ be defined by (2). For all $\mu$

$$E_{0,\mu}(\tilde T) = 2(2\mu - \delta)^{-2}[c(2\mu - \delta)/\delta + \exp\{-c(2\mu - \delta)/\delta\} - 1], \tag{14}$$

where for $\mu = \delta/2$ the right hand side of (14) is defined to be $(c/\delta)^2$. For all $x \ge 0$, $y > 0$,
as $t \to \infty$,

$$P^x_\infty\{Y(t) \le y\} \to 1 - \exp(-y). \tag{15}$$

By Proposition 1 and (14), equating $E_\infty(\tilde T)$ with $E_\infty(T)$ means setting $A = \exp(c) -
c - 1$, where $A = \delta^2B/2$. This can be asymptotically inverted as $A \to \infty$ to yield $c =
\log A + (\log A + 1)/A + o(1/A)$. For the special case $\mu = \delta$, substitution of this relation
into (14) yields

$$E_{0,\delta}(\tilde T) = 2\delta^{-2}\{\log A - 1 + O(1/A)\}. \tag{16}$$

Comparing (12) and (16), we see that $E_{0,\delta}(\tilde T)$ is asymptotically smaller than $E_{0,\delta}(T)$, but
the difference is not large enough to indicate a strong preference for Page's procedure. For
$\mu \ne \delta$, the procedures are compared numerically in Section 5.
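
The calibration just described is easy to carry out numerically. The sketch below evaluates (14) and solves $A = \exp(c) - c - 1$ for $c$ by Newton's method; the function names and the choice of starting point are illustrative assumptions, not part of the paper.

```python
import math

def page_arl(mu, delta, c):
    """Formula (14); at mu = delta/2 the limiting value (c/delta)^2 is used."""
    k = (2.0 * mu - delta) / delta
    if abs(k) < 1e-12:
        return (c / delta) ** 2
    return 2.0 / (2.0 * mu - delta) ** 2 * (c * k + math.exp(-c * k) - 1.0)

def matching_threshold(B, delta, tol=1e-12):
    """Solve exp(c) - c - 1 = delta^2*B/2 for c > 0, so that both rules have ARL B under P_infinity."""
    A = 0.5 * delta ** 2 * B
    c = math.log(A + 1.0) + 1.0      # start to the right of the root; Newton then decreases monotonically
    for _ in range(100):
        step = (math.exp(c) - c - 1.0 - A) / (math.exp(c) - 1.0)
        c -= step
        if abs(step) < tol:
            break
    return c
```

For `B = 792` and `delta = 1` this gives a threshold close to 6, and `page_arl(0.0, 1.0, c)` recovers 792 as it should.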

3. Asymptotic Evaluation of $E_\nu(T - \nu \mid T > \nu)$ as $\nu$ and $B \to \infty$.

In this section we try to compare $T$ and $\tilde T$ when the time of change, $\nu$, is large and the
stopping rule has not yet signaled a change. The optimality considerations of Pollak (1984)
(see also Appendix A) suggest that the Shiryayev-Roberts rule should be better than Page's
under these conditions. We shall see below that this expectation is essentially correct, but
the difference is usually small.

A possible formulation is to evaluate $\lim_{\nu \to \infty} E_\nu(T - \nu \mid T > \nu)$ with $B$ fixed. However,
this seems difficult technically and also inappropriate conceptually. In most applications
we envision that the cost of a false alarm is substantial; and hence, at least insofar as we
are able to make crude prior judgments about the range of $\nu$, we should choose $B$ roughly
comparable to $\nu$ - or perhaps larger. Here we shall suppose that $\nu$ and $B$ are simultaneously
large, which, by virtue of the following lemma, allows us to utilize the $P_\infty$ unconditional
limiting distribution of $R(t)$ calculated in Proposition 3.

Again let $E^x$ ($P^x$) denote expectation (probability) when $R(0) = x$.

Lemma 1. For any $x, y, t > 0$

$$P^x_\infty\{R(t) \le y \mid T > t\} \ge P^x_\infty\{R(t) \le y\}.$$

Also $P_\infty\{R(t) \le y\} \ge P_\infty\{R(\infty) \le y\}$, where $R(\infty)$ denotes a random variable having the
distribution evaluated in Proposition 3. As $t$ and $B \to \infty$

$$P_\infty\{R(t) \le y \mid T > t\} \to P_\infty\{R(\infty) \le y\}.$$


Since this result seems potentially of wider interest than the specific technical
requirements of the present paper, and since our proof uses essentially none of the structure of
$R(t)$ beyond the fact that it is stochastically monotone, a complete proof of Lemma 1 will
be published elsewhere.

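Theorem 1. Let $A = \delta^2B/2$. Suppose $\mu > 0$ and $B, \nu \to \infty$. For $\mu > \delta/2$

$$E_\nu(T - \nu \mid T > \nu) = \{\delta(\mu - \delta/2)\}^{-1}[\log A - \gamma - \delta/\{2(\mu - \delta/2)\} + o(1)].$$

For $\mu = \delta/2$

$$E_\nu(T - \nu \mid T > \nu) = \delta^{-2}\{(\log A - \gamma)^2 + \pi^2/6 + o(1)\}.$$

For $\mu < \delta/2$

$$E_\nu(T - \nu \mid T > \nu) = \{\delta(\delta/2 - \mu)\}^{-1}\{A^{1-2\mu/\delta}\,\Gamma(1 - 2\mu/\delta) - \log A + \gamma - (1 - 2\mu/\delta)^{-1} + o(1)\}.$$

Here $\Gamma$ denotes the gamma function and $\gamma = -\Gamma'(1) \approx .5772$ is Euler's constant.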

Proof. We start from

$$E_{\nu,\mu}(T - \nu \mid T > \nu) = \int_0^B E^x_{0,\mu}(T_B)\,P_\infty\{R(\nu) \in dx \mid T > \nu\}$$

$$= E^0_{0,\mu}(T_B) - \int_0^B E^0_{0,\mu}(\tau_x)\,P_\infty\{R(\nu) \in dx \mid T > \nu\}.$$

To complete the proof, we first replace the measures $P_\infty\{R(\nu) \in dx \mid T > \nu\}$ by their limit,
$P_\infty\{R(\infty) \in dx\}$ (cf. Lemma 1), and then evaluate the resulting integrals (cf. Proposition
3 and (6), (7)).

To justify replacing the distributions $P_\infty\{R(\nu) \in dx \mid T > \nu\}$ by their limit, it suffices
to show that $g(x) = E^0_{0,\mu}(\tau_x)$ is uniformly integrable with respect to these distributions.
But by Lemma 1, the distributions are stochastically smaller than their limit; and since $g$
is monotone increasing, it is uniformly integrable if and only if it is integrable with respect
to the limiting distribution. It follows from Proposition 2 and some calculation that for
any $\mu > 0$ there exists $r < 1$ such that $g(x) = O(x^r)$. From Proposition 3 we see that
$P_\infty\{R(\infty) > x\} \sim 2/(\delta^2x)$, so $g$ is in fact integrable. Evaluation of the resulting integrals to
obtain the results stated above is sketched in Appendix B.

An  essentially  identical  argument  applies  to  Page’s  procedure.  (In  fact,  as  noted 
above,  it  is  possible  to  abstract  the  essential  features  of  Lemma  1  to  cover  both  processes 
simultaneously.)  We  record  the  final  result  as  Theorem  2. 

Theorem 2. For $\tilde T$ defined by (2), as $c$ and $\nu \to \infty$, $E_\nu(\tilde T - \nu \mid \tilde T > \nu)$ is given
approximately by the following expressions for the respective cases (i) $\mu \ne \delta/2$ and (ii) $\mu = \delta/2$:

(i) $\{\delta(\mu - \delta/2)\}^{-1}[c - 1 - \delta^2/\{2\mu(2\mu - \delta)\} + \{\delta/(2\mu - \delta)\}\exp\{c(1 - 2\mu/\delta)\} + o(1)]$,
and

(ii) $\delta^{-2}(c^2 - 2) + o(1)$.

If $c$ is defined by the relation $A = \exp(c) - c - 1$, so that $E_\infty(\tilde T) = E_\infty(T)$, the results
given in Theorem 2 are easily rewritten to be directly comparable to those of Theorem
1. For example, $E_\nu(T - \nu \mid T > \nu)$ is asymptotically smaller than $E_\nu(\tilde T - \nu \mid \tilde T > \nu)$ if
$\mu < (1 - \gamma)^{-1}\delta/2 = 1.18\delta$, but not otherwise. The differences are not large if $\mu \ge \delta/2$. These
results explain the general conclusions of Roberts (1966), which in his case were based on
a Monte Carlo experiment.
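
The leading terms of Theorems 1 and 2 can be evaluated directly, as in the sketch below. It drops the $o(1)$ terms, takes the dominant term for $\mu < \delta/2$ in Theorem 1 to be $A^{1-2\mu/\delta}\,\Gamma(1 - 2\mu/\delta)$ (the reading that reproduces the entry 111 of Table 2), and calibrates $c$ from $A = \exp(c) - c - 1$. All function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def sr_delay(mu, delta, B):
    """Leading terms of Theorem 1 for the Shiryayev-Roberts rule, A = delta^2*B/2."""
    A = 0.5 * delta ** 2 * B
    L = math.log(A)
    j = 1.0 - 2.0 * mu / delta
    if abs(j) < 1e-12:                       # mu = delta/2
        return ((L - EULER_GAMMA) ** 2 + math.pi ** 2 / 6.0) / delta ** 2
    if j < 0:                                # mu > delta/2
        return (L - EULER_GAMMA - delta / (2.0 * mu - delta)) / (delta * (mu - delta / 2.0))
    return (A ** j * math.gamma(j) - L + EULER_GAMMA - 1.0 / j) / (delta * (delta / 2.0 - mu))

def page_delay(mu, delta, c):
    """Leading terms of Theorem 2 for Page's rule."""
    k = 2.0 * mu - delta
    if abs(k) < 1e-12:                       # case (ii)
        return (c * c - 2.0) / delta ** 2
    bracket = (c - 1.0 - delta ** 2 / (2.0 * mu * k)
               + delta / k * math.exp(c * (1.0 - 2.0 * mu / delta)))
    return bracket / (delta * (mu - delta / 2.0))

def matching_threshold(B, delta):
    """Newton solve of exp(c) - c - 1 = delta^2*B/2, started to the right of the root."""
    A = 0.5 * delta ** 2 * B
    c = math.log(A + 1.0) + 1.0
    for _ in range(100):
        step = (math.exp(c) - c - 1.0 - A) / (math.exp(c) - 1.0)
        c -= step
        if abs(step) < 1e-12:
            break
    return c
```

With `B = 792`, `delta = 1` the Shiryayev-Roberts values round to the corresponding entries of Table 2; the Page values agree with most entries, with small discrepancies that may reflect rounding of $c$ in the original computation.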

Remark: Shiryayev (1963) considers the problem of "detection of destruction of a
stationary regime," which has some technical points in common with the preceding discussion, but
is conceptually quite different. See Appendix C.


4. Unknown $\mu$.


In the preceding sections we have studied procedures which are approximately optimal
under the assumption that at time $\nu$ the change from $\mu = 0$ is to a known value $\mu = \delta$.
Since this assumption is never satisfied in practice, we have evaluated these procedures for
general values of $\mu$. Now we consider a generalization of the Shiryayev-Roberts rule for the
case of unknown $\mu$.


Let $G$ denote a probability on $(0,\infty)$. (The distribution $G$ could be interpreted as a
prior distribution for the value of $\mu$ after the change at time $\nu$.) Let $R_\delta(t)$ be defined by
(3), where now we use a subscript to denote dependence on the value of $\delta$. Define

$$\hat R(t) = \int_0^\infty R_y(t)\,dG(y), \tag{17}$$

and let

$$\hat T = \hat T_B = \inf\{t : \hat R(t) \ge B\} \tag{18}$$

be defined as in (1), but with $\hat R$ in place of $R$. Since $R_y(t) - t$ is a $P_\infty$ martingale with
mean equal to 0 for every value of $y$, it follows that $\hat R(t) - t$ is also a martingale with
mean 0. Hence, exactly as in Proposition 1

$$E_\infty(\hat T) = B. \tag{19}$$


Before turning to an evaluation of $E_{0,\mu}(\hat T)$, we note that an analogous modification of Page's
rule was defined and studied in Pollak and Siegmund (1975). Unfortunately, however, we do
not know an approximation to the $P_\infty$ average run length of this procedure because there is
no similar martingale structure. Hence we were limited then, as we are now, to examining
the $P_{0,\mu}$ average run length. In Section 6 we give other examples of processes for which a
version of the Shiryayev-Roberts rule and the $P_\infty$ average run length are easily obtained,
while the corresponding analogue of Page's procedure seems much more difficult to study.

Theorem 3. Let $\mu > 0$ and suppose that in some neighborhood of $\mu$ the measure $G$ has
positive, continuous density $g$. Then as $B \to \infty$

$$E_{0,\mu}(\hat T) = \mu^{-2}[2\log B + \log\log B - 1 - 2\gamma - \log\{2\pi g^2(\mu)\} - \log(2/\mu^2) + o(1)].$$


The basic idea of the proof of Theorem 3 is already apparent in the arguments of
Pollak and Siegmund (1975) or in the more general extension of Lai and Siegmund (1979).
A completely rigorous development of the corresponding result in discrete time has been
given by Pollak (1983). For the sake of completeness, we give a brief outline in Appendix
D.
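
The expansion in Theorem 3 is simple to evaluate. The sketch below does so for the half-normal mixing density used in Section 5, dropping the $o(1)$ term and taking the constant to be $-1 - 2\gamma$, which is the value that reproduces the mixture column of Table 4 ($B = 5944$). The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def half_normal_density(y):
    """Density of |Z| for standard normal Z: (2/pi)^(1/2) * exp(-y^2/2), y > 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * y * y)

def mixture_arl(mu, B, density=half_normal_density):
    """Leading terms of Theorem 3 for the mixture rule (18); requires mu > 0 and B > e."""
    g = density(mu)
    bracket = (2.0 * math.log(B) + math.log(math.log(B)) - 1.0 - 2.0 * EULER_GAMMA
               - math.log(2.0 * math.pi * g * g) - math.log(2.0 / mu ** 2))
    return bracket / mu ** 2
```

For instance, `mixture_arl(1.0, 5944.0)` is roughly 16, compared with roughly 14 for Page's rule with $c = 8$, $\delta = 1$ from (14).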

By comparing Theorem 3 with Propositions 2 and 4, one sees that it is possible
asymptotically to do as well to first order as in the case of known $\mu = \delta$. For moderate sample
sizes, the higher order terms play an important role, which is investigated numerically in
the next section.

5.  Numerical  Comparisons. 

Tables 1 and 2 compare $T$ given by (1) with $\tilde T$ given by (2) for the cases $\nu = 0$ and
$\nu \to \infty$, respectively. The values of $B$ and $c$ were chosen so that $E_\infty(T)$ and $E_\infty(\tilde T)$ are
about 790, which seems appropriate for a variety of industrial sampling inspection schemes.
The entries in Table 1 were computed by integrating (10) numerically and by applying (14).
For Table 2 the asymptotic results of Theorems 1 and 2 were used.

Remark: The numerical integration in (10) can be quite time consuming. For the
important range $\mu > \delta/2$ it is possible to show that $E_{0,\mu}(T) = \{\delta(\mu - \delta/2)\}^{-1}\{\log B + \text{const.} +
o(1)\}$, so it is possible to do the numerical computation for a moderate value of $B$ and
obtain approximations for other $B$ from this one value. Alternatively, in certain ranges one
can evaluate the integrals as infinite series to speed up the computations.

Tables 1 and 2 show that the Shiryayev-Roberts and Page rules are almost
indistinguishable at $\mu = \delta$. For larger values of $\mu$ Page's rule does slightly better, while for smaller
values the Shiryayev-Roberts rule seems preferable. The greatest percentage differences are
those favoring Page's rule when $\mu$ is large and $\nu = 0$. When $\nu \to \infty$, the difference favoring
Page's rule decreases while the small difference in favor of Shiryayev-Roberts remains about
what it is for $\nu = 0$.

Since the choice of $\delta$ is to some extent arbitrary, it is interesting to observe that in
Tables 1 and 2 the choice $\delta = 1/2$ yields much smaller average run lengths for small $\mu$ at
a relatively minor cost for large $\mu$ than does the choice $\delta = 1$. This suggests that one may
wish to use a smaller value of $\delta$ than the change that one "expects" to occur. In some
applications, even the relatively small increase in average run length for large $\mu$ entailed by
choosing a small $\delta$ may be too costly to make this strategy seem reasonable.

Table 2

Comparison for Large $\nu$ and $B$

  $\mu$      $E_{\nu,\mu}(T - \nu \mid T > \nu)$      $E_{\nu,\mu}(\tilde T - \nu \mid \tilde T > \nu)$

             ($B = 792$, $\delta = 1$)         ($\delta^2B/2 = e^c - 1 - c$)
  .25              111                      124
  .50               31                       34
  1.0              8.8                      9.0
  2.0              3.4                      3.3

             ($B = 791$, $\delta = .5$)        ($\delta^2B/2 = e^c - 1 - c$)
  .25               71                       76
  .50               24                       25
  1.0              9.8                      9.4
  2.0              4.4                      4.1

Table 3 uses Theorem 3 to give comparable results for $\hat T$ defined by (18) with $G$ the
distribution of the absolute value of a standard normal random variable. For values of $B$
in the indicated range, the use of a mixture to define $\hat T$ seems generally inferior to the
practice of using an appropriate, fixed value of $\delta$ to define either a Page or a Roberts type
rule. The situation changes for larger values of $B$. Some results are contained in Table
4, which shows that for some inefficiency at $\mu = \delta$, one can do better for extreme $\mu$ by
using the mixture stopping rule $\hat T$. Since the stopping rule $\hat T$ is asymptotically first order
optimal simultaneously for all $\mu > 0$, still larger values of $B$ will tend to favor $\hat T$ over $\tilde T$.
However, this asymptotic optimality seems to take over so slowly as to be irrelevant for
many problems.

Table 3

Average Run Lengths for $\hat T$ Defined by (18)
with $dG(y) = (2/\pi)^{1/2}\exp(-y^2/2)\,dy$, $B = 792$

  $\mu$      $E_{0,\mu}(\hat T)$
   0            792
  .25           123
  .50            40
  1.0          12.?
  1.5           6.1
  2.0           4.1


Table 4

Comparisons of $\hat T$ and $\tilde T$ for large $B$

  $\mu$      $E_{0,\mu}(\hat T)$      $E_{0,\mu}(\tilde T)$          $E_{0,\mu}(\tilde T)$
             ($B = 5944$)      ($c = 8$, $\delta = 1$)     ($c = 6.62$, $\delta = .5$)
   0           5944              5944
  .25           202               397               175
  .50            57                64                45
  1.0            16                14                17
  1.5           8.2               7.5              10.3
  2.0           5.2               5.1               7.4


6.  Open  Problems. 

We believe that the evidence presented above indicates that there is no persuasive
scientific reason for preferring the Page stopping rule to that suggested by Shiryayev and
Roberts, or vice versa. Depending on the specific context, one or the other might be slightly
preferable; but in general the choice may be based essentially on convenience. In this section
we indicate some open problems, for which versions of the Shiryayev-Roberts rule are more
or less obvious and can be studied by techniques similar to those developed here. Page's
rule, on the other hand, seems less suited to deal with these new problems. The important
distinguishing feature of the Shiryayev-Roberts rule is the martingale property utilized in
Proposition 1, which seems to have no analogue for cusum procedures.

A particularly interesting article on the applied use of cusum stopping rules is Wilson
et al. (1979). In order to control the quality of radioimmunoassays, plasmas of known
composition were occasionally submitted to be assayed, with the result regarded as a normal
random variable with unknown mean and variance. The process is regarded as in control if
the mean of this random variable is the known composition and if the variance is "small."
The stopping rule should detect a change (either increase or decrease) in the mean value or
an increase in the variance. Moreover, the target value for the mean is a known quantity;
but in the case of the variance the target is whatever can be achieved by careful application
of the assaying method - hopefully small - but there is no a priori value which one knows

This  problem  differs  in  three  important  respects  from  the  very  simple  model  described 
above:  (a)  two-sided  alternatives,  (b)  a  multidimensional  parameter  space,  and  (c)  an 
unknown  initial  parameter  value.  We  shall  give  here  a  brief  discussion  of  each  of  these 
issues,  but  defer  to  a  subsequent  paper  a  more  thorough  investigation. 

The issue of two-sided alternatives is the simplest, at least in fairly symmetric
problems. The standard modification of Page's procedure is to run two one-sided cusum tests
simultaneously, stopping as soon as at least one indicates that a change has occurred. See
van Dobben de Bruyn (1968) for a complete discussion. An appropriate modification of
the Shiryayev-Roberts rule would be to take a mixture of processes as in Section 4, but
with the mixing measure $G$ giving positive measure to both positive and negative values
of $\mu$. The simplest case would be the measure putting weight 1/2 on $+\delta$ and on $-\delta$. For
simple modifications of this sort, comparisons of the two procedures yield essentially the same
conclusions as in the one-sided case.

For a multidimensional parameter space, the natural generalization of Roberts' rule is
again to form a mixture over the parameter space as in Section 4, and the basic theory is
much as in the one dimensional case. To obtain a Page-like process, it is possible to run
simultaneous cusum procedures (Wilson et al., 1979) or use the method of mixtures (Pollak
and Siegmund, 1975); but we have no idea what this does to the average run length under
$P_\infty$.

In many respects the most interesting variation arises when the initial, in control,
parameter value is unknown. If the probability model exhibits the appropriate invariance
under a group of transformations, as in the case of a normal mean or variance, one can
define a Page or Shiryayev-Roberts procedure in terms of a maximal invariant function of
the data. For example, to detect the change of the drift of Brownian motion from an initial
unknown value $\mu_0$ to a new value $\mu_0 + \delta$, the invariant analogue of $R(t)$ defined in (3) is

$$R^*(t) = \int_0^t \exp[\delta\{sW(t)/t - W(s)\} - \delta^2s(1 - s/t)/2]\,ds.$$

Relative to the appropriate $\sigma$-fields the process $R^*(t) - t$ is a $P_\infty$ martingale. Hence if
$T^* = \inf\{t : R^*(t) = B\}$, exactly as in Proposition 1 we have that $E_\infty(T^*) = B$. Although
an analogous Page type procedure is easily defined, we have no idea what its $P_\infty$ average
run length is.

An entirely new feature of the problem of the preceding paragraph is that evaluation
of $E_\nu(T^* - \nu \mid T^* > \nu)$ only at the extreme values of $\nu = 0$ and $\nu \to \infty$ is uninteresting.
If a "change" occurs at $\nu = 0$, it cannot be detected because the new value of $\mu$ cannot be
distinguished from the initial value. If the change takes place after an extremely long period
of time, we have so much data to estimate $\mu_0$ that we are effectively back in the situation
where $\mu_0$ is known. We expect to discuss this model in a subsequent paper.


Appendix  A 


We sketch here the optimality considerations of Shiryayev (1963) and Pollak (1984).
Assume that $\mu$ equals either 0 or $\delta$.

Assuming that $\nu$ has an exponential distribution with mean $1/\lambda$ and that if one stops
at $t$, the loss is 1 or $c(t - \nu)$ according as $t < \nu$ or $t \ge \nu$, Shiryayev (1963) shows that the
Bayes rule stops at

$$\tau(\lambda,c) = \inf\{t : P\{\nu \le t \mid W(s),\, s \le t\} \ge b(\lambda,c)\}.$$

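From Shiryayev's formula for $b(\lambda, c)$ it follows that $b_c = \lim_{\lambda \to 0} b(\lambda, c)/\lambda$ exists and satisfies
$c \exp(\ldots/b_c) \int y^{-1} \exp(-y)\,dy = 1$ (the argument of the exponential and the limits of
integration are illegible in the source). Hence $T_B$ defined by (1) is a limit of Bayes rules for a
particular $c = c_B$.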

It is possible to modify $T_B$ slightly to make its risk essentially constant in $\nu$ and hence
to make $T_B$ approximately minimax. Let $R'(t)$ denote the process $R(t)$ started off from the
$P_\infty$-limiting distribution of $R(s)$ given $T_B > s$ as $s \to \infty$. Let $T'_B = \inf\{t : R'(t) \ge B\}$.
Then as $B \to \infty$

$$\sup_\nu E_\nu(T'_B - \nu \mid T'_B > \nu) = \inf_\tau \sup_\nu E_\nu(\tau - \nu \mid \tau > \nu) + o(1)$$

$$= 2\delta^{-2}\{\log(\delta^2 B/2) - 1 - \gamma\} + o(1),$$

where the inf is over all rules $\tau$ with $E_\infty(\tau) \ge E_\infty(T'_B)$. The first equality follows from
arguments similar to but much easier than those of Pollak (1984), Theorem 3. The second
is a consequence of Theorem 1 of this paper.


Appendix B


Here we discuss calculation of the integrals involved in the proof of Theorem 1. It
suffices to evaluate (up to terms converging to 0 as $B \to \infty$)

$$E^0_{0,\mu}(T_B) - \int_0^B E^0_{0,\mu}(T_x)\,d\bar{M}_0(x), \qquad (20)$$

where $\bar{M}_0$ denotes the speed measure $M_0$ normalized to be a probability (this is the
limiting distribution of $R(t)$), and by (10) $E^0_{0,\mu}(T_x) = 2\int_0^x \int_y^x dS_\mu(z)\,dM_\mu(y)$. Inverting the
order of integration in (20), so that we integrate first with respect to $\bar{M}_0$, then $M_\mu$, yields

$$\int_0^B E^0_{0,\mu}(T_x)\,d\bar{M}_0(x) = 2\int_0^B \{\exp(-2/\delta^2 B) - \exp(-2/\delta^2 z)\} \int_0^z dM_\mu(y)\,dS_\mu(z).$$

The first term on the right hand side can be recognized to equal $E^0_{0,\mu}(T_B)\exp(-2/\delta^2 B)$,
while the second, after another inversion in the order of integration, becomes

$$2\delta^{-2} \int_0^B y^{2\mu/\delta - 2} \exp(-2/\delta^2 y) \int_y^B z^{-2\mu/\delta}\,dz\,dy. \qquad (21)$$

Putting these facts together shows that (20) equals the sum of (21) and

$$E^0_{0,\mu}(T_B)\{1 - \exp(-2/\delta^2 B)\}. \qquad (22)$$

As $B \to \infty$, for any $\mu > 0$, $E^0_{0,\mu}(T_B) = o(B)$, so (22) is asymptotically negligible. Hence it
remains to evaluate (21), which is a tedious but fairly straightforward job and yields the
results given in Theorem 1.
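
As a numerical sanity check on this last step, the special case $\mu = \delta$ can be evaluated directly. The sketch below assumes that for $\mu = \delta$ the integral (21) reduces to $2\delta^{-2}\int_0^B \exp(-2/\delta^2 y)(1/y - 1/B)\,dy$ (the inner $z$-integral being $1/y - 1/B$), and compares a plain midpoint-rule value with the expression $2\delta^{-2}\{\log(\delta^2 B/2) - 1 - \gamma\}$ quoted in Appendix A. The function names and the truncation constant 50 are illustrative choices, not part of the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329


def integral_21(delta, B, n=20000):
    """Midpoint-rule value of 2/delta^2 * int_0^B exp(-a/y) (1/y - 1/B) dy, a = 2/delta^2.

    The substitution y = exp(v) turns the integrand into
    exp(-a * exp(-v)) * (1 - exp(v)/B), which is smooth in v.
    Assumes B is much larger than a, so that the grid below is non-empty.
    """
    a = 2.0 / delta**2
    lo = math.log(a / 50.0)  # below this point exp(-a/y) < exp(-50), negligible
    hi = math.log(B)
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        v = lo + (k + 0.5) * h
        y = math.exp(v)
        total += math.exp(-a / y) * (1.0 - y / B)
    return 2.0 / delta**2 * total * h


def asymptotic_delay(delta, B):
    """2/delta^2 * {log(delta^2 B / 2) - 1 - gamma}, the limiting expression."""
    return 2.0 / delta**2 * (math.log(delta**2 * B / 2.0) - 1.0 - EULER_GAMMA)


for B in (1e2, 1e4, 1e6):
    print(B, integral_21(1.0, B), asymptotic_delay(1.0, B))
```

Under these assumptions the gap between the two columns shrinks as $B$ grows, in line with the $o(1)$ error term.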


Appendix  C 


Shiryayev (1963) discusses "detection of destruction of a stationary regime," which
superficially resembles our Section 3. Here we attempt to indicate some important differences.

In addition to our basic assumptions, Shiryayev assumes (i) when the stopping rule
(1) or (2) indicates a change, we can immediately ascertain whether a change has indeed
occurred, restarting the process (from 0) if there has been no change; and (ii) this new
process (i.e. the original process, renewed at each false alarm) has been running for an
extremely long time without any changepoint, so that the number of false alarms already
observed is becoming infinitely large. Mathematically this means that $\nu \to \infty$, perhaps
$B \to \infty$; but in any case $\nu/B \to \infty$.

Shiryayev's formulation is presumably reasonable if the cost associated with a false
alarm is relatively small compared to the cost of observation after the changepoint $\nu$. On
the other hand, we envision situations where the cost of a false alarm is substantial, and/or
it is difficult to tell immediately whether an alarm is a false one. Thus we have taken $\nu$ and
$B$ simultaneously large with no assumed relation between them.

With  some  reformulation,  it  is  possible  to  bring  about  a  partial  unification  of  the  two 
viewpoints. 


Suppose with Shiryayev that at each false alarm the detection process is immediately
restarted from scratch. Let $S_1, S_2, \ldots$ denote the times of successive alarms, and let $L =
L(t) = \max\{k : S_k \le t\}$ be the number of alarms before time $t$. Then $L(t)$ is a $P_\infty$-renewal
process with renewal epochs $S_n$. Suppose that we agree to measure delay in detecting a
disorder not by $E_\nu(S_1 - \nu \mid S_1 > \nu)$ as we have in the rest of this paper, but by

$$E_\nu(S_{L(\nu)+1} - \nu) = \int_0^\nu E_t(S_1 - t \mid S_1 > t)\,P_\nu\{\nu - S_{L(\nu)} \in dt\}. \qquad (23)$$


Assuming $\mu = \delta$, Shiryayev uses renewal theory to evaluate (23) as $\nu \to \infty$ and then
evaluates this limit as $B \to \infty$. Assume now that $\nu$ and $B$ simultaneously become infinitely
large, in any relation whatever. For either of the detection rules (1) or (2), it is easy to use
the results of Theorems 1 and 2, together with the fact that the $P_\infty$ distribution of $\nu - S_{L(\nu)}$
becomes more diffuse as $E_\infty(S_1) \to \infty$, to show that the asymptotic evaluations given in
Theorems 1 and 2 are also satisfied by the new criterion (23) for any $\mu > 0$.


Appendix D


In this appendix we indicate very informally the ideas leading to Theorem 3. We
assume throughout that $\nu = 0$ and write $E_\mu$ to denote expectation.

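For the stopping rule (1) with $\mu = \delta$, we have

$$\log B = \mu W(T) - \mu^2 T/2 + \log \int_0^T \exp\{-\mu W(s) + \mu^2 s/2\}\,ds,$$

so by Wald's identity

$$\log B = \tfrac{1}{2}\mu^2 E_\mu(T) + E_\mu\left[\log \int_0^T \exp\{-\mu W(s) + \mu^2 s/2\}\,ds\right]. \qquad (24)$$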


As $B \to \infty$, $T = T_B \to \infty$ in probability, so

$$E_\mu\left[\log \int_0^T \exp\{-\mu W(s) + \mu^2 s/2\}\,ds\right] = E_\mu\left[\log \int_0^\infty \exp\{-\mu W(s) + \mu^2 s/2\}\,ds\right] + o(1), \qquad (25)$$

and  this  last  expectation  can  be  evaluated  exactly  by  consideration  of  (12),  (24),  and  (25). 
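
Combining (24) and (25) makes the structure of the argument explicit. The first line below is only a rearrangement of those two displays. The remaining lines additionally assume a classical fact that is not stated in this appendix: for a standard Brownian motion $\beta$, $\int_0^\infty \exp\{2(\beta_t - t)\}\,dt$ is distributed as $1/(2G)$ with $G$ a standard exponential variable (Dufresne's identity). The resulting closed form is offered as a worked illustration under that assumption, not as a restatement of Theorem 1.

```latex
% Under P_mu write W(s) = mu s + beta(s), with beta a standard Brownian motion.
\begin{align*}
\log B
  &= \tfrac12 \mu^2 E_\mu(T)
     + E_\mu\Big[\log \int_0^\infty \exp\{-\mu W(s) + \mu^2 s/2\}\,ds\Big] + o(1)
     && \text{by (24) and (25)} \\
\int_0^\infty \exp\{-\mu W(s) + \mu^2 s/2\}\,ds
  &= \int_0^\infty \exp\{-\mu\beta(s) - \mu^2 s/2\}\,ds
   \;\overset{d}{=}\; \frac{2}{\mu^2 G}
     && \text{(assumed: Dufresne's identity, time change } s = 4t/\mu^2\text{)} \\
E_\mu(T)
  &= 2\mu^{-2}\{\log(\mu^2 B/2) - \gamma\} + o(1)
     && \text{since } E(\log G) = -\gamma .
\end{align*}
```

Here $\gamma$ is Euler's constant; the last line follows by substituting $E_\mu[\log\{2/(\mu^2 G)\}] = \log(2/\mu^2) + \gamma$ into the first and solving for $E_\mu(T)$.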


For the stopping rule (18), some algebra yields

$$\log B = \mu W(T) - \mu^2 T/2 + \tfrac{1}{2}\{W(T) - \mu T\}^2/T - \tfrac{1}{2}\log\{T/(2\pi)\}$$

$$+ \log \int_0^\infty T^{1/2}\varphi[T^{1/2}\{y - W(T)/T\}] \int_0^T \exp\{-yW(s) + y^2 s/2\}\,ds\,dG(y). \qquad (26)$$

The (random) measure $T^{1/2}\varphi[T^{1/2}\{y - W(T)/T\}]\,dG(y)$ becomes progressively
more concentrated around $W(T)/T$ as $B$ and hence $T \to \infty$, and with overwhelming
probability, $W(T)/T$ becomes concentrated around $\mu$. Hence as $B \to \infty$ the expectation of
the final expression in (26) converges to the expectation appearing on the right hand side
of (25), which can be evaluated. By Wald's identity and some additional calculations we
have as $B \to \infty$

$$E_\mu\{\mu W(T) - \mu^2 T/2\} = \tfrac{1}{2}\mu^2 E_\mu(T),$$

$$E_\mu[\{W(T) - \mu T\}^2/T] \to 1,$$

and

$$E_\mu\{\log(T)\} \sim \log\{E_\mu(T)\} \sim \log\{(2\log B)/\mu^2\}.$$

Substituting these results into (26) yields Theorem 3.